60 research outputs found

Large Scale Co-Regularized Ranking

Author: Heskes T.
Hofmann K.
Tsivtsivadze E.
Publication venue: Technische Universität Darmstadt, Knowledge Engineering Group
Publication date: 01/01/2012

International Migration, Integration and Social Cohesion online publications

An efficient algorithm for learning to rank from preference graphs

Author: Airola A
Boberg J
Jarvinen J
Pahikkala T
Tsivtsivadze E
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 28/10/2022

In this paper, we introduce a framework for regularized least-squares (RLS) type of ranking cost functions and we propose three such cost functions. Further, we propose a kernel-based preference learning algorithm, which we call RankRLS, for minimizing these functions. It is shown that RankRLS has many computational advantages compared to the ranking algorithms that are based on minimizing other types of costs, such as the hinge cost. In particular, we present efficient algorithms for training, parameter selection, multiple output learning, cross-validation, and large-scale learning. Circumstances under which these computational benefits make RankRLS preferable to RankSVM are considered. We evaluate RankRLS on four different types of ranking tasks using RankSVM and the standard RLS regression as the baselines. RankRLS outperforms the standard RLS regression and its performance is very similar to that of RankSVM, while RankRLS has several computational benefits over RankSVM.

UTUPub

Premise Selection for Mathematics by Corpus Analysis and Kernel Methods

Author: A Grabowski
A Riazanov
A Tarski
A Trybulec
B Scholkopf
CM Bishop
Daniel Kühlwein
E Tsivtsivadze
Evgeni Tsivtsivadze
J Meng
J Shawe-Taylor
J Urban
Jesse Alama
Josef Urban
MD Richard
P Rudnicki
R Rifkin
S Schulz
S Shalev-Shwartz
Tom Heskes
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 12/04/2012

Smart premise selection is essential when using automated reasoning as a tool for large-theory formal proof development. A good method for premise selection in complex mathematical libraries is the application of machine learning to large corpora of proofs. This work develops learning-based premise selection in two ways. First, a newly available minimal dependency analysis of existing high-level formal mathematical proofs is used to build a large knowledge base of proof dependencies, providing precise data for ATP-based re-verification and for training premise selection algorithms. Second, a new machine learning algorithm for premise selection based on kernel methods is proposed and implemented. To evaluate the impact of both techniques, a benchmark consisting of 2078 large-theory mathematical problems is constructed, extending the older MPTP Challenge benchmark. The combined effect of the techniques results in a 50% improvement on the benchmark over the Vampire/SInE state-of-the-art system for automated reasoning in large theories.

arXiv.org e-Print Archive

Crossref

Radboud Repository

BioInfer: a corpus for information extraction in the biomedical domain

Author: A Yakushiji
CF Baker
D Lin
DD Sleator
E Alphonse
E Tsivtsivadze
F Ginter
Filip Ginter
G Hripcsak
H Shatkay
J Cohen
J Ding
J Kim
Jari Björne
JM Temkin
Jorma Boberg
Jouni Järvinen
Juho Heimonen
K Franzén
K Kipper
KB Cohen
L Hirschman
L Salwinski
M Ashburner
N Daraselia
P Kingsbury
P Szolovits
S Aubin
S Pyysalo
S Siegel
Sampo Pyysalo
T Ohta
T Pahikkala
T Wattarujeekrit
Tapio Salakoski
TH King
Y Tateisi
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Lately, there has been a great interest in the application of information extraction methods to the biomedical domain, in particular, to the extraction of relationships of genes, proteins, and RNA from scientific publications. The development and evaluation of such methods requires annotated domain corpora. RESULTS: We present BioInfer (Bio Information Extraction Resource), a new public resource providing an annotated corpus of biomedical English. We describe an annotation scheme capturing named entities and their relationships along with a dependency analysis of sentence syntax. We further present ontologies defining the types of entities and relationships annotated in the corpus. Currently, the corpus contains 1100 sentences from abstracts of biomedical research articles annotated for relationships, named entities, as well as syntactic dependencies. Supporting software is provided with the corpus. The corpus is unique in the domain in combining these annotation types for a single set of sentences, and in the level of detail of the relationship annotation. CONCLUSION: We introduce a corpus targeted at protein, gene, and RNA relationships which serves as a resource for the development of information extraction systems and their components such as parsers and domain analyzers. The corpus will be maintained and further developed with a current version being available at

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Locality-Convolution Kernel and Its Application to Dependency Parse Ranking

Author: A. Zien
B. Scholkopf
B. Schölkopf
E. Tsivtsivadze
H. Lodhi
J. Shawe-Taylor
M. Collins
M.G. Kendall
T. Pahikkala
T. Poggio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

We propose a Locality-Convolution (LC) kernel in application to dependency parse ranking. The LC kernel measures parse similarities locally, within a small window constructed around each matching feature. Inside the window it makes use of a position sensitive function to take into account the order of the feature appearance. The similarity between two windows is calculated by computing the product of their common attributes and the kernel value is the sum of the window similarities. We applied the introduced kernel together with Regularized Least-Squares (RLS) algorithm to a dataset containing dependency parses obtained from a manually annotated biomedical corpus of 1100 sentences. Our experiments show that RLS with LC kernel performs better than the baseline method. The results outline the importance of local correlations and the order of feature appearance within the parse. Final validation demonstrates statistically significant increase in parse ranking performance.

CiteSeerX

Crossref

Lexical adaptation of link grammar to the biomedical sublanguage: a comparative evaluation of three approaches

Author: A Clegg
A Mikheev
A Yakushiji
Adeline Nazarenko
C Blaschke
C Grover
DD Sleator
E Alphonse
E Tsivtsivadze
F Wilcoxon
J Demšar
J Ding
J Park
J Pustejovsky
JD Kim
M Lease
P Spyns
P Szolovits
R Grishman
S Aubin
S Kulick
S Pyysalo
S Sekine
Sampo Pyysalo
Sophie Aubin
ST Ahmed
Tapio Salakoski
Y Tsuruoka
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Benchmarking natural-language parsers for biological applications using dependency graphs

Author: A Bies
AB Clegg
Adrian J Shepherd
Andrew B Clegg
B Rosario
B Srinivas
C Friedman
C Grover
D Blaheta
D Gildea
D Klein
D Lin
D Sleator
DM Bikel
E Charniak
E Tsivtsivadze
EB Camon
EJ Briscoe
G Sampson
G Schneider
IM Goldin
J Carroll
J Finkel
J Xiao
JM Temkin
K Franzén
K Knight
KB Cohen
L Smith
M Collins
M Lease
MC de Marneffe
MP Marcus
N Domedel-Puig
N Ge
O Sanchez
P Merlo
PG Mutalik
S Abney
S Kübler
S Pyysalo
ST Ahmed
T Briscoe
TC Rindflesch
Y Huang
Z Shi
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Interest is growing in the application of syntactic parsers to natural language processing problems in biology, but assessing their performance is difficult because differences in linguistic convention can falsely appear to be errors. We present a method for evaluating their accuracy using an intermediate representation based on dependency graphs, in which the semantic relationships important in most information extraction tasks are closer to the surface. We also demonstrate how this method can be easily tailored to various application-driven criteria. RESULTS: Using the GENIA corpus as a gold standard, we tested four open-source parsers which have been used in bioinformatics projects. We first present overall performance measures, and test the two leading tools, the Charniak-Lease and Bikel parsers, on subtasks tailored to reflect the requirements of a system for extracting gene expression relationships. These two tools clearly outperform the other parsers in the evaluation, and achieve accuracy levels comparable to or exceeding native dependency parsers on similar tasks in previous biological evaluations. CONCLUSION: Evaluating using dependency graphs allows parsers to be tested easily on criteria chosen according to the semantics of particular biological applications, drawing attention to important mistakes and soaking up many insignificant differences that would otherwise be reported as errors. Generating high-accuracy dependency graphs from the output of phrase-structure parsers also provides access to the more detailed syntax trees that are used in several natural-language processing techniques.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Sparse Preference Learning

Author: Heskes T.
Tsivtsivadze E.
Publication venue: Sutcliffe : s.n
Publication date: 01/01/2010

Contains fulltext: 83983.pdf (preprint version) (Open Access). NIPS Workshop on Practical Application of Sparse Modeling: Open Issues and New Directions, 10 December

Radboud Repository

Kernel Principal Component Ranking: Robust Ranking on Noisy Data

Author: Cseke B.
Heskes T.M.
Tsivtsivadze E.
Publication venue: [S.l.] : Pascal Lecture Series
Publication date: 01/01/2009

Contains fulltext: 76137.pdf (preprint version) (Open Access). European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Workshop on Preference Learning, Bled, Slovenia, 7 November 200

CiteSeerX

Radboud Repository

Large Scale Co-Regularized Ranking

Author: Heskes T.
Hofmann K.
Tsivtsivadze E.
Publication venue: [S.l. : s.n.]
Publication date: 01/01/2012

Contains fulltext: 103471.pdf (publisher's version) (Open Access). ECAI 2012: Preference learning: problems and applications in AI, PL-12, August 28, 2012, Montpellier, France

Radboud Repository

UvA-DARE

International Migration, Integration and Social Cohesion online publications